Finding the most influential nodes in a network is a computationally hard problem with several possible applications in various network-based problems. Although several methods have been proposed to tackle the Influence Maximization (IM) problem, their runtimes typically scale poorly as the network size increases. Here, we propose an original method, based on network downscaling, that allows a multi-objective evolutionary algorithm (MOEA) to solve the IM problem on a reduced-scale network while preserving the relevant properties of the original network. The downscaled solution is then upscaled to the original network using a mechanism based on centrality metrics such as PageRank. Our results on eight large networks (including one with $\sim$50k nodes) demonstrate the effectiveness of the proposed method, with a runtime gain of more than 10 times compared to the time needed on the original network, and a time reduction of up to $82\%$ compared to CELF.
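As an illustration of the upscaling step, here is a minimal sketch (assuming a NetworkX graph and a hypothetical `reduced_to_original` mapping from reduced-network nodes to the original nodes they represent) of how seeds found on the downscaled network could be mapped back using PageRank; the paper's exact mechanism may differ.

```python
import networkx as nx

def upscale_seed_set(G_original, reduced_to_original, reduced_seeds, k):
    """Map a seed set chosen on the reduced network back to the original network.

    reduced_to_original: dict mapping each reduced-network node to the set of
    original-network nodes it represents (hypothetical structure, for illustration).
    """
    pagerank = nx.pagerank(G_original)  # centrality metric used to pick representatives
    upscaled = []
    for r in reduced_seeds:
        candidates = reduced_to_original[r]
        # choose the highest-PageRank original node represented by this reduced node
        upscaled.append(max(candidates, key=pagerank.get))
    # keep at most k seeds, preferring the most central ones
    return sorted(set(upscaled), key=pagerank.get, reverse=True)[:k]
```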
Many industrial sectors have been collecting big sensor data. With recent technologies for processing big data, companies can exploit it for automatic failure detection and prevention. We propose the first fully automated method for failure analysis, machine-learning fault trees from raw observational data with continuous variables. Our method scales well and is tested on a real-world dataset of domestic heater operations in the Netherlands, with 31 million unique heater readings, each containing 27 sensors and 11 failure variables. Our method builds on two previous procedures: the C4.5 decision-tree learning algorithm, and the LIFT fault tree learning algorithm for Boolean data. C4.5 pre-processes each continuous variable: it learns an optimal numerical threshold that distinguishes faulty from normal operation of the top-level system. These thresholds discretise the variables, thus enabling LIFT to learn fault trees that model the root failure mechanisms of the system and are explainable. We obtain fault trees for the 11 failure variables and evaluate them in two ways: quantitatively, with a significance score, and qualitatively, with domain experts. Some of the learned fault trees have almost maximum significance (above 0.95), while others have medium-to-low significance (around 0.30), which reflects the difficulty of learning from big, noisy, real-world sensor data. The domain experts confirm that the fault trees model meaningful relationships among the variables.
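To illustrate the threshold-learning step, here is a minimal C4.5-style sketch (not the authors' implementation) of finding the numerical cut point on a continuous sensor variable that maximises information gain with respect to the top-level failure label.

```python
import numpy as np

def entropy(labels):
    """Shannon entropy of a boolean label vector."""
    p = labels.mean()
    if p in (0.0, 1.0):
        return 0.0
    return -(p * np.log2(p) + (1 - p) * np.log2(1 - p))

def best_threshold(values, failed):
    """C4.5-style cut point: the split on a continuous sensor variable
    that maximises information gain w.r.t. the failure label."""
    values, failed = np.asarray(values, float), np.asarray(failed, bool)
    order = np.argsort(values)
    values, failed = values[order], failed[order]
    base = entropy(failed)
    best_gain, best_cut = -1.0, None
    for i in range(1, len(values)):
        if values[i] == values[i - 1]:
            continue  # candidate cuts lie midway between distinct consecutive values
        cut = (values[i] + values[i - 1]) / 2
        left, right = failed[:i], failed[i:]
        gain = base - (len(left) * entropy(left) + len(right) * entropy(right)) / len(failed)
        if gain > best_gain:
            best_gain, best_cut = gain, cut
    return best_cut
```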
Discovering new hyperlinks enables web crawlers to find newly created pages that have not yet been indexed. This is especially important for focused crawlers, because they strive to provide a comprehensive analysis of specific parts of the web, and thus prioritise discovering new pages over detecting changes in content. In the literature, changes in hyperlinks and changes in content are usually considered simultaneously. However, there is also evidence that these two kinds of changes are not necessarily related. Moreover, many studies on change prediction assume that long page histories are available, which is unattainable in practice. The aim of this work is to provide a methodology for effectively detecting new links using a short history. To this end, we use a dataset of ten crawls at intervals of one week. Our study consists of three parts. First, we obtain insight into the data by analysing the empirical properties of the number of new outlinks. We observe that these properties are, on average, stable over time, but that there is a large difference between the appearance of hyperlinks pointing to pages within and outside the domain of the target page (internal and external outlinks, respectively). Next, we provide statistical models for three targets: the link change rate, the presence of new links, and the number of new links. These models include features used earlier in the literature as well as new features introduced in this work. We analyse the correlations among the features and investigate their informativeness. A notable finding is that, if the history of the target page is not available, our new features, which represent the history of related pages, are the most predictive of new links in the target page. Finally, we propose ranking methods as guidelines for focused crawlers to efficiently discover new pages, which achieve excellent performance with respect to the corresponding targets.
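A hedged sketch of the kind of modelling and ranking workflow described above, using scikit-learn with placeholder features and labels (the concrete feature set, targets, and models in the paper may differ).

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Placeholder feature matrix: one row per target page, columns such as past
# internal/external outlink counts and aggregated history of related pages.
rng = np.random.default_rng(0)
X = rng.random((1000, 6))            # placeholder features, not real crawl data
y = rng.integers(0, 2, 1000)         # 1 = at least one new link in the next crawl

model = LogisticRegression(max_iter=1000)
print("CV ROC-AUC:", cross_val_score(model, X, y, cv=5, scoring="roc_auc").mean())

# Ranking guideline for a focused crawler: visit pages in decreasing order of
# the predicted probability of containing new links.
model.fit(X, y)
ranking = np.argsort(-model.predict_proba(X)[:, 1])
```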
In traditional astronomies around the world, stars in the night sky were linked into constellations: symbolic representations on the celestial sphere, rich in meaning and with practical roles. In some cultures, constellations are represented as line (connect-the-dots) figures, which are spatial networks drawn over the fixed background of stars, but free in their choice of stars and lines. We first define the visual signature of a constellation: a multi-dimensional complexity measure capturing its network, spatial, and brightness features. We then answer the question: are the culture, the type of culture, or the sky region strongly associated with the visual signature of the line figures, and may thus have determined their shape? We analyse 1591 line figures from 50 astronomical cultures from all continents and with long histories, find that the line figures form seven distinct clusters by affinity in visual signature, and draw the following conclusions. A small number of individual cultures have a unique visual signature. Oral astronomies are diverse in their network and spatial features, but use brighter stars. Constellations used for navigation, religious purposes, and timekeeping for agriculture or hunting-gathering are similar, but those from Mesoamerica and Mesopotamia have distinct visual signatures. We find clear cross-cultural similarities, with Asian traditions set apart from Mesopotamian, South and North American, Australian, and Polynesian traditions. We also find a wide diversity of visual signatures in many sky regions: most widely used stars have different line designs drawn around them.
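A minimal sketch of what a visual signature could look like in code, using NetworkX with illustrative proxies for the network, spatial, and brightness dimensions; the paper's actual feature set is richer and may be computed differently.

```python
import networkx as nx
import numpy as np

def visual_signature(stars, edges):
    """Toy visual signature of a constellation line figure.

    stars: dict star_id -> (ra_deg, dec_deg, magnitude)
    edges: list of (star_id, star_id) line segments
    """
    G = nx.Graph(edges)
    ras = np.array([stars[s][0] for s in G.nodes])
    decs = np.array([stars[s][1] for s in G.nodes])
    mags = np.array([stars[s][2] for s in G.nodes])
    return {
        "n_stars": G.number_of_nodes(),                          # network size
        "n_lines": G.number_of_edges(),
        "avg_degree": 2 * G.number_of_edges() / G.number_of_nodes(),
        "angular_extent": float(max(np.ptp(ras), np.ptp(decs))),  # spatial spread, degrees
        "mean_magnitude": float(mags.mean()),                      # brightness (lower = brighter)
    }
```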
Drug dosing is an important application of AI, which can be formulated as a Reinforcement Learning (RL) problem. In this paper, we identify two major challenges of using RL for drug dosing: delayed and prolonged effects of administering medications, which break the Markov assumption of the RL framework. We focus on prolongedness and define PAE-POMDP (Prolonged Action Effect-Partially Observable Markov Decision Process), a subclass of POMDPs in which the Markov assumption does not hold specifically due to prolonged effects of actions. Motivated by the pharmacology literature, we propose a simple and effective approach to converting drug dosing PAE-POMDPs into MDPs, enabling the use of the existing RL algorithms to solve such problems. We validate the proposed approach on a toy task, and a challenging glucose control task, for which we devise a clinically-inspired reward function. Our results demonstrate that: (1) the proposed method to restore the Markov assumption leads to significant improvements over a vanilla baseline; (2) the approach is competitive with recurrent policies which may inherently capture the prolonged effect of actions; (3) it is remarkably more time and memory efficient than the recurrent baseline and hence more suitable for real-time dosing control systems; and (4) it exhibits favorable qualitative behavior in our policy analysis.
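One plausible way to realise such a conversion is sketched below, under the assumption of a pharmacokinetics-style exponentially decaying action effect appended to the observation of a Gym-like environment; this is an illustrative wrapper, not necessarily the paper's exact construction.

```python
import numpy as np

class ProlongedEffectWrapper:
    """Hedged sketch: augment the observation with an exponentially decaying
    trace of past doses, so that a standard (memory-less) MDP policy can
    account for prolonged action effects. The decay model is an assumption
    motivated by pharmacokinetics."""

    def __init__(self, env, decay=0.9):
        self.env = env          # assumed Gym-like interface: reset() and step()
        self.decay = decay
        self.effect = 0.0

    def reset(self):
        self.effect = 0.0
        obs = self.env.reset()
        return np.append(obs, self.effect)

    def step(self, action):
        # accumulate the new dose on top of the decayed residual effect
        self.effect = self.decay * self.effect + float(action)
        obs, reward, done, info = self.env.step(action)
        return np.append(obs, self.effect), reward, done, info
```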
Learning policies from fixed offline datasets is a key challenge to scale up reinforcement learning (RL) algorithms towards practical applications. This is often because off-policy RL algorithms suffer from distributional shift, due to the mismatch between the dataset and the target policy, leading to high variance and over-estimation of value functions. In this work, we propose variance regularization for offline RL algorithms, using stationary distribution corrections. We show that by using Fenchel duality, we can avoid double sampling issues for computing the gradient of the variance regularizer. The proposed algorithm for offline variance regularization (OVAR) can be used to augment any existing offline policy optimization algorithms. We show that the regularizer leads to a lower bound to the offline policy optimization objective, which can help avoid over-estimation errors, and explains the benefits of our approach across a range of continuous control domains when compared to existing state-of-the-art algorithms.
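The double-sampling issue arises because the variance contains a squared expectation, whose unbiased gradient would normally require two independent samples. A hedged sketch of the Fenchel-duality trick that removes it (the paper's exact formulation may differ):

```latex
% Variance decomposition and the Fenchel dual of the squared-expectation term
\operatorname{Var}(X) \;=\; \mathbb{E}[X^2] \;-\; \bigl(\mathbb{E}[X]\bigr)^2,
\qquad
\bigl(\mathbb{E}[X]\bigr)^2 \;=\; \max_{\nu \in \mathbb{R}}
        \bigl( 2\,\nu\,\mathbb{E}[X] \;-\; \nu^{2} \bigr).
```

After the dual reformulation, every term contains only a single expectation over X, so its stochastic gradient can be estimated from one sample per update, with an inner maximisation over the scalar dual variable ν.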
The core operation of current Graph Neural Networks (GNNs) is the aggregation enabled by the graph Laplacian or message passing, which filters the neighborhood information of nodes. Though effective for various tasks, in this paper we show that they are potentially a problematic factor underlying all GNN models for learning on certain datasets, as they force the node representations to become similar, making the nodes gradually lose their identity and become indistinguishable. Hence, we augment the aggregation operations with their dual, i.e., diversification operators that make the nodes more distinct and preserve their identity. Such augmentation replaces the aggregation with a two-channel filtering process that, in theory, is beneficial for enriching the node representations. In practice, the proposed two-channel filters can be easily patched onto existing GNN methods with diverse training strategies, including spectral and spatial (message passing) methods. In the experiments, we observe the desired characteristics of the models and a significant performance boost upon the baselines on 9 node classification tasks.
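A minimal PyTorch sketch of such a two-channel layer, combining an aggregation (low-pass) channel with a diversification (high-pass) channel; the exact operators and the way the channels are combined in the paper may differ.

```python
import torch
import torch.nn as nn

class TwoChannelLayer(nn.Module):
    """Two-channel graph filter: aggregation (A_hat @ X) plus
    diversification ((I - A_hat) @ X). Illustrative only."""

    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.agg = nn.Linear(in_dim, out_dim)   # smoothing channel
        self.div = nn.Linear(in_dim, out_dim)   # sharpening channel

    def forward(self, x, a_hat):
        # a_hat: normalised adjacency matrix with self-loops (dense, for brevity)
        low = a_hat @ x          # aggregation: pulls neighbour representations together
        high = x - low           # diversification: (I - A_hat) x, preserves node identity
        return torch.relu(self.agg(low) + self.div(high))
```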
Using massive datasets to train large-scale models has emerged as a dominant approach for broad generalization in natural language and vision applications. In reinforcement learning, however, a key challenge is that available data of sequential decision making is often not annotated with actions - for example, videos of game-play are much more available than sequences of frames paired with their logged game controls. We propose to circumvent this challenge by combining large but sparsely-annotated datasets from a \emph{target} environment of interest with fully-annotated datasets from various other \emph{source} environments. Our method, Action Limited PreTraining (ALPT), leverages the generalization capabilities of inverse dynamics modelling (IDM) to label missing action data in the target environment. We show that utilizing even one additional environment dataset of labelled data during IDM pretraining gives rise to substantial improvements in generating action labels for unannotated sequences. We evaluate our method on benchmark game-playing environments and show that we can significantly improve game performance and generalization capability compared to other approaches, using annotated datasets equivalent to only $12$ minutes of gameplay. Highlighting the power of IDM, we show that these benefits remain even when target and source environments share no common actions.
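A hedged sketch of the IDM-based labelling idea: train an inverse dynamics model on action-labelled transitions, then use it to pseudo-label unannotated target-environment sequences. Architecture and training details are placeholders, not the paper's exact model.

```python
import torch
import torch.nn as nn

class InverseDynamicsModel(nn.Module):
    """Predict the (discrete) action taken between two consecutive observations."""

    def __init__(self, obs_dim, n_actions, hidden=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(2 * obs_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, n_actions),
        )

    def forward(self, obs, next_obs):
        return self.net(torch.cat([obs, next_obs], dim=-1))

def label_unannotated(idm, obs_seq):
    """Fill in missing action labels for an unannotated trajectory of shape (T, obs_dim)."""
    with torch.no_grad():
        logits = idm(obs_seq[:-1], obs_seq[1:])
        return logits.argmax(dim=-1)   # pseudo-labels, one per transition
```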
Reinforcement learning (RL) folklore suggests that history-based function approximation methods, such as recurrent neural nets or history-based state abstraction, perform better than their memory-less counterparts, due to the fact that function approximation in Markov decision processes (MDPs) can be viewed as inducing a partially observable MDP. However, there has been little formal analysis of such history-based algorithms, as most existing frameworks focus exclusively on memory-less features. In this paper, we introduce a theoretical framework for studying the behaviour of RL algorithms that learn to control an MDP using history-based feature abstraction mappings. Furthermore, we use this framework to design a practical RL algorithm and we numerically evaluate its effectiveness on a set of continuous control tasks.
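A minimal sketch of the interface such a history-based feature abstraction might expose, where a user-supplied mapping phi over the last k observation-action pairs produces the features consumed by a memory-less RL algorithm; this is illustrative only, not the paper's construction.

```python
from collections import deque

class HistoryAbstraction:
    """Maintain a window of the last k (observation, action) pairs and map it
    through a feature abstraction phi, yielding the state fed to a memory-less
    RL algorithm."""

    def __init__(self, k, phi):
        self.k = k
        self.phi = phi                      # feature abstraction mapping over histories
        self.history = deque(maxlen=k)

    def reset(self, obs):
        self.history.clear()
        self.history.append((obs, None))    # no action precedes the first observation
        return self.phi(list(self.history))

    def update(self, action, obs):
        self.history.append((obs, action))
        return self.phi(list(self.history))
```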
Abstraction has been widely studied as a way to improve the efficiency and generalization of reinforcement learning algorithms. In this paper, we study abstraction in the continuous-control setting. We extend the definition of MDP homomorphisms to encompass continuous actions in continuous state spaces. We derive a policy gradient theorem on the abstract MDP, which allows us to leverage approximate symmetries of the environment for policy optimization. Based on this theorem, we propose an algorithm that is able to use the lax bisimulation metric to learn the MDP homomorphism map along with the policy. We demonstrate the effectiveness of our method on benchmark tasks from the DeepMind Control Suite. Our method's ability to exploit MDP homomorphisms for representation learning leads to improved performance when learning from pixel observations.
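For reference, the standard discrete MDP homomorphism conditions that the paper generalises to continuous spaces (shown only to fix notation, not as the paper's exact continuous definition):

```latex
% An MDP homomorphism h = (f, \{g_s\}) with f : S \to \bar S and g_s : A \to \bar A
% preserves rewards and (aggregated) transition probabilities:
\bar R\bigl(f(s),\, g_s(a)\bigr) = R(s, a),
\qquad
\bar P\bigl(f(s') \mid f(s),\, g_s(a)\bigr)
    = \sum_{s'' \in f^{-1}(f(s'))} P(s'' \mid s, a).
% In continuous state and action spaces the sum over the pre-image becomes an
% integral (a pushforward of the transition kernel).
```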